Modified MFCC Methods Based on KL-Transform and Power Law for Robust Speech Recognition
Abstract
This paper presents two robust feature extraction techniques for isolated digit recognition: Mel Power Karhunen-Loève Transform Coefficients (MPKC) and Mel Power Coefficients (MPC). This hybrid method combines Stevens' power law of hearing with the Karhunen-Loève (KL) transform to improve noise robustness. We evaluated the proposed methods on a Hidden Markov Model (HMM) based isolated digit recognition system, using TIDIGITS data for clean speech as well as noisy speech data. The proposed methods yield higher recognition accuracy than the conventional Mel Frequency Cepstral Coefficients (MFCC) technique.
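As a rough illustration of the two ingredients named in the abstract, the following Python sketch applies a power-law compression to mel filterbank energies and then decorrelates the result with a KL transform estimated from training frames. It is one plausible reading, not the authors' implementation: the exponent of 1/3, the function names (`power_law`, `fit_klt`, `apply_klt`), the number of bands, and the number of retained coefficients are all assumptions, since the abstract does not give them.

```python
import numpy as np


def power_law(mel_energies, exponent=1.0 / 3.0):
    """Compress mel filterbank energies with a power law instead of a log.

    The exponent is an illustrative assumption, not a value from the paper.
    """
    return np.power(np.maximum(mel_energies, 0.0), exponent)


def fit_klt(features, n_components):
    """Estimate a KL transform basis from frames of shape (frames, bands).

    The basis consists of the eigenvectors of the sample covariance matrix,
    ordered by decreasing eigenvalue.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def apply_klt(features, mean, basis):
    """Project mean-removed frames onto the KL basis."""
    return (features - mean) @ basis


if __name__ == "__main__":
    # Synthetic stand-in for mel filterbank energies (200 frames, 24 bands).
    rng = np.random.default_rng(0)
    mel = rng.gamma(shape=2.0, scale=1.0, size=(200, 24))

    compressed = power_law(mel)
    mean, basis = fit_klt(compressed, n_components=12)
    coeffs = apply_klt(compressed, mean, basis)
    print(coeffs.shape)
```

Unlike the fixed DCT basis used in conventional MFCC, the KL basis in this sketch is learned from the data, so it has to be estimated once on training frames and then reused unchanged at test time.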
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition
In this paper, we present two robust feature extractors that use a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator, used in many front-ends including the conventional MFCC, for estimating the speech power spectrum. Direct spectrum estimators, e.g., single tapered periodogram, have high vari...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Effectiveness of KL-transformation in spectral delta expansion
MFCC is widely used together with its delta and delta-delta features in the field of speech recognition based on HMM. MFCC is designed to apply DCT to the MF output. We propose in this paper to employ KL transformation instead of DCT, because it can reflect the statistics of speech data more precisely. MFCC is the compressed feature of the log MF so that some detailed features seem to be lost. ...
Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction
This paper presents a new feature extraction algorithm called PNCC that is based on auditory processing. Major new features of PNCC processing include the use of a power-law nonlinearity that replaces the traditional log nonlinearity used in MFCC coefficients, and a novel algorithm to suppress background excitation using medium-duration power estimation based on the ratio of the arithmetic mean to the geo...